Linguistic Annotation of Two Prosodic Databases
Abstract
Two prosodic databases, one of American English and one of Modern Standard German, were annotated with linguistic information using SGML (Standard Generalized Markup Language). Only information that might have prosodic correlates was annotated. Phonetic and morphological information was supplied by automatic tools and then hand-corrected. Semantic and pragmatic information was inserted by hand. The SGML tagset is essentially the same for both languages. Tags delimit structural units; all other information is supplied by attributes. The databases themselves, which combine linguistic and phonetic information, are stored as SPSS files.
Similar resources
Optimization of MFNs for signal-based phrase break prediction
The automatic prosodic annotation of large speech corpora is receiving increasing attention, since appropriate databases are needed for training prosodic models in speech synthesis and recognition. At the linguistic level, correct phrase and accent marking are essential processing steps. The authors developed a neural network based method for signal-based phrase break prediction and tested this ...
Design and Evaluation of Shared Prosodic Annotation for Spontaneous French Speech: From Expert Knowledge to Non-Expert Annotation
In the area of large French speech corpora, there is a demonstrated need for a common prosodic notation system allowing for easy data exchange, comparison, and automatic annotation. The major questions are: (1) how to develop a single simple scheme of prosodic transcription which could form the basis of guidelines for non-expert manual annotation (NEMA), used for linguistic teaching and researc...
Querying Databases of Annotated Speech
Annotated speech corpora are databases consisting of signal data along with time-aligned symbolic ‘transcriptions’. Such databases are typically multidimensional, heterogeneous and dynamic. These properties present a number of tough challenges for representation and query. The temporal nature of the data adds an additional layer of complexity. This paper presents and harmonises two independent ...
Form versus Function – Prosodic Annotation and Modeling go Hand in Hand
This paper argues that prosodic annotation and modeling should be combined for facilitating analyses of prosodic functions that invariably require perceptual judgments. It compares perceptual prosodic annotations of prominent syllables and phrase boundaries with labels yielded by the combination of linguistic information from a TTS-front end, model-based prosodic features, as well as a model of...
Consistency Maintenance in Prosodic Labeling for Reliable Prediction of Prosodic Breaks
Large-scale annotated speech corpora have been widely used to implement prosody prediction models. Reliability among transcribers, however, was too low for an automatic prosodic predictor to be learned successfully. This paper reveals our observations on performance deterioration of the learning model due to inconsistent tagging of prosodic breaks in the established corpora. Th...